The multi-resolution extended edit distance

Authors

  • Muhammad Marwan Muhammad Fuad
  • Pierre-François Marteau
Abstract

Similarity search is a fundamental problem in information technology. The main difficulty of this problem is the high dimensionality of the data objects. In large time series databases, it is important to reduce the dimensionality of these data objects so that they can be managed. Symbolic representation is a promising technique for dimensionality reduction. In this paper we propose a new distance metric that is applied to symbolic sequential data objects, and we test it on time series databases in classification task experiments. We also compare it to other distances that are well known in the literature for symbolic data objects, and we prove that it is a metric.
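For orientation, the classical edit distance (Levenshtein distance) that the title builds on can be sketched as below. This is only the well-known baseline, not the multi-resolution extended edit distance proposed in the paper, whose definition is not given on this page. The function name `edit_distance` and the example symbol strings are illustrative assumptions.

```python
def edit_distance(s: str, t: str) -> int:
    """Classical Levenshtein distance between two symbol strings.

    Baseline only: this is NOT the paper's multi-resolution extended
    edit distance. Unit cost is assumed for every operation.
    """
    # prev[j] holds the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        curr = [i]  # distance from s[:i] to the empty string
        for j, b in enumerate(t, start=1):
            cost = 0 if a == b else 1
            curr.append(min(
                prev[j] + 1,         # delete a from s
                curr[j - 1] + 1,     # insert b into s
                prev[j - 1] + cost,  # substitute a with b (free if equal)
            ))
        prev = curr
    return prev[-1]


# Two hypothetical symbolic words, as might result from discretizing
# two time series over the alphabet {a, b, c}.
print(edit_distance("aabcc", "abbcc"))  # one substitution
```

The rolling-row form keeps memory proportional to the length of the second string, which matters when many pairs of sequences are compared in a classification experiment.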

Similar resources

Parameter-Free Extended Edit Distance

The edit distance is the most famous distance to compute the similarity between two strings of characters. The main drawback of the edit distance is that it is based on local procedures which reflect only a local view of similarity. To remedy this problem we presented in a previous work the extended edit distance, which adds a global view of similarity between two strings. However, the extended...

Mining Transliterations from Wikipedia using Dynamic Bayesian Networks

Transliteration mining is aimed at building high quality multi-lingual named entity (NE) lexicons for improving performance in various Natural Language Processing (NLP) tasks including Machine Translation (MT) and Cross Language Information Retrieval (CLIR). In this paper, we apply two Dynamic Bayesian network (DBN)-based edit distance (ED) approaches in mining transliteration pairs from Wikipe...

Efficient Algorithms for Approximate String Matching with Swaps (Extended Abstract)

Most research on the edit distance problem and the k-differences problem considered the set of edit operations consisting of changes, insertions, and deletions. In this paper we include the swap operation that interchanges two adjacent characters into the set of allowable edit operations, and we present an O(t min(m, n))-time algorithm for the extended edit distance problem, where t is the edit...

Map Edit Distance vs. Graph Edit Distance for Matching Images

Generalized maps are widely used to model the topology of nD objects (such as 2D or 3D images) by means of incidence and adjacency relationships between cells (0D vertices, 1D edges, 2D faces, 3D volumes, ...). We have introduced in [1] a map edit distance. This distance compares maps by means of a minimum cost sequence of edit operations that should be performed to transform a map into another...

FURY: Fuzzy Unification and Resolution Based on Edit Distance

We present a theoretically founded framework for fuzzy unification and resolution based on edit distance over trees. Our framework extends classical unification and resolution conservatively. We prove important properties of the framework and develop the FURY system, which implements the framework efficiently using dynamic programming. We evaluate the framework and system on a large problem in ...

Publication date: 2008